A Divided Regression Analysis for Big Data
نویسندگان
چکیده
Statistics is an important part in big data because many statistical methods are used for big data analysis. The aim of statistics is to estimate population using the sample extracted from the population, so statistics is to analyze not the population but the sample. But in big data environment, we can get the big data set closed to the population by the advanced computing systems such as cloud computing and high-speed internet. According to the circumstances, we can analyze entire part of big data like the population of statistics. But we may be impossible to analyze the entire data because of its huge data volume. So, in this paper, we propose a new analytical methodology for big data analysis in regression problem for reducing the computing burden. We call this a divided regression analysis. To verify the performance of our divided regression model, we carry out experiment and simulation.
منابع مشابه
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in dataset that is currently being produced at high speed, large volumes and with a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing the huge amount of data, today has become one of the concerns of data science scholar...
متن کاملMulti-Objective Model for Fair Pricing of Electricity Using the Parameters from the Iran Electricity Market Big Data Analysis
Assessment of the electricity market shows that, electricity market data can be considered "big data". this data has been analyzed by both conventional and modern data mining methods. The predicted variables of supply and demand are considered to be the input of a defined multi-objective for predicting electricity price, which is the result of the defined model. This shows the advantage of appl...
متن کامل2016 Olympic Games on Twitter: Sentiment Analysis of Sports Fans Tweets using Big Data Framework
Big data analytics is one of the most important subjects in computer science. Today, due to the increasing expansion of Web technology, a large amount of data is available to researchers. Extracting information from these data is one of the requirements for many organizations and business centers. In recent years, the massive amount of Twitter's social networking data has become a platform for ...
متن کاملبررسی رابطه میان ویژگیهای شخصیتی و استراتژیهای خود - رهبری (مورد مطالعه: دانشجویان دانشگاه اصفهان)
Introduction: Self-direction is a process by which people direct and motivate themselves to behave and perform in a desirable way. Self-direction has been used successfully in many fields so far. It seems that personality traits of individuals have a relationship with self-direction skill. The purpose of the study was to investigate the relationship between Big Five Personality Factors (extrove...
متن کاملA Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection
Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015